Default Clustering from Sparse Data Sets
نویسندگان
چکیده
Categorization with a very high missing data rate is seldom studied, especially from a non-probabilistic point of view. This paper proposes a new algorithm called default clustering that relies on default reasoning and uses the local search paradigm. Two kinds of experiments are considered: the first one presents the results obtained on artificial data sets, the second uses an original and real case where political stereotypes are extracted from newspaper articles at the end of the 19th century.
منابع مشابه
Modeling Default Induction with Conceptual Structures
Our goal is to model the way people induce knowledge from rare and sparse data. This paper describes a theoretical framework for inducing knowledge from these incomplete data described with conceptual graphs. The induction engine is based on a non-supervised algorithm named default clustering which uses the concept of stereotype and the new notion of default subsumption, the latter being inspir...
متن کاملClustering of Conceptual Graphs with Sparse Data
This paper gives a theoretical framework for clustering a set of conceptual graphs characterized by sparse descriptions. The formed clusters are named in an intelligible manner through the concept of stereotype, based on the notion of default generalization. The cognitive model we propose relies on sets of stereotypes and makes it possible to save data in a structured memory.
متن کاملDefault Clustering with Conceptual Structures
This paper describes a theoretical framework for inducing knowledge from incomplete data sets. The general framework can be used with any formalism based on a lattice structure. It is illustrated within two formalisms: the attribute-value formalism and Sowa’s conceptual graphs. The induction engine is based on a non-supervised algorithm called default clustering which uses the concept of stereo...
متن کاملMulti-rank Sparse Hierarchical Clustering
There has been a surge in the number of large and flat data sets – data sets containing a large number of features and a relatively small number of observations – due to the growing ability to collect and store information in medical research and other fields. Hierarchical clustering is a widely used clustering tool. In hierarchical clustering, large and flat data sets may allow for a better co...
متن کاملA natural framework for sparse hierarchical clustering
There has been a surge in the number of large and flat data sets – data sets containing a large number of features and a relatively small number of observations – due to the growing ability to collect and store information in medical research and other fields. Hierarchical clustering is a widely used clustering tool. In hierarchical clustering, large and flat data sets may allow for a better co...
متن کامل